Overview

Dataset statistics

Number of variables: 12
Number of observations: 17703
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 96.0 B
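
The memory figures above can be sanity-checked with simple arithmetic. The sketch below assumes, which the report does not state, that each of the 12 columns stores 8-byte values (for example float64, int64 or datetime64[ns]); the constant names are hypothetical.

```python
# Assumption (not stated in the report): every column holds 8-byte values.
N_COLUMNS = 12
N_ROWS = 17703
BYTES_PER_VALUE = 8  # hypothetical storage width per cell

record_bytes = N_COLUMNS * BYTES_PER_VALUE      # bytes needed for one row
total_mib = N_ROWS * record_bytes / 2**20       # whole table, in MiB
column_kib = N_ROWS * BYTES_PER_VALUE / 2**10   # a single column, in KiB

print(record_bytes)          # 96
print(round(total_mib, 1))   # 1.6
print(round(column_kib, 1))  # 138.3
```

Under this assumption the record size and total size match the report. The per-variable memory size reported further down (138.4 KiB) is marginally larger than the 138.3 KiB computed here, possibly because it includes a small index overhead.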

Variable types

Numeric: 11
DateTime: 1

Warnings

gross_revenue is highly correlated with invoice_no [High correlation]
invoice_no is highly correlated with gross_revenue and 1 other field [High correlation]
avg_ticket is highly correlated with qtde_returns and 1 other field [High correlation]
frequency is highly correlated with invoice_no [High correlation]
qtde_returns is highly correlated with avg_ticket and 1 other field [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other field [High correlation]
gross_revenue is highly correlated with recency_days and 5 other fields [High correlation]
recency_days is highly correlated with gross_revenue and 1 other field [High correlation]
invoice_no is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_recency_days is highly correlated with gross_revenue and 3 other fields [High correlation]
frequency is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_returns is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_basket_size is highly correlated with gross_revenue [High correlation]
avg_unique_basket_size is highly correlated with avg_ticket [High correlation]
gross_revenue is highly correlated with invoice_no [High correlation]
invoice_no is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_recency_days is highly correlated with invoice_no and 1 other field [High correlation]
frequency is highly correlated with invoice_no and 1 other field [High correlation]
frequency is highly correlated with gross_revenue and 1 other field [High correlation]
gross_revenue is highly correlated with frequency and 1 other field [High correlation]
qtde_returns is highly correlated with avg_basket_size and 1 other field [High correlation]
avg_basket_size is highly correlated with qtde_returns and 1 other field [High correlation]
avg_ticket is highly correlated with qtde_returns and 1 other field [High correlation]
invoice_no is highly correlated with frequency and 1 other field [High correlation]
avg_ticket is highly skewed (γ1 = 90.51563435) [Skewed]
qtde_returns is highly skewed (γ1 = 51.67714053) [Skewed]
avg_basket_size is highly skewed (γ1 = 49.74912676) [Skewed]
df_index has unique values [Unique]
recency_days has 635 (3.6%) zeros [Zeros]
qtde_returns has 5482 (31.0%) zeros [Zeros]
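
The γ1 values in the skewness warnings are skewness coefficients. A minimal sketch of the basic moment-based definition follows. The function name and toy data are hypothetical, and the profiler most likely relies on a bias-corrected estimator from pandas, so this sketch is illustrative rather than a reproduction of the reported figures.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]           # toy data, symmetric around 3
long_right_tail = [1, 1, 1, 1, 100]   # toy data with one large outlier

print(skewness(symmetric))            # 0.0
print(skewness(long_right_tail) > 1)  # True
```

A single extreme value is enough to push the coefficient well above zero, which is consistent with variables such as avg_ticket whose maximum lies far beyond the 95th percentile.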

Reproduction

Analysis started: 2021-05-28 00:25:16.595433
Analysis finished: 2021-05-28 00:25:31.067358
Duration: 14.47 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 17703
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10350.0871
Minimum: 0
Maximum: 20522
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1050.1
Q1: 5280.5
median: 10402
Q3: 15428
95-th percentile: 19539.9
Maximum: 20522
Range: 20522
Interquartile range (IQR): 10147.5

Descriptive statistics

Standard deviation: 5906.864597
Coefficient of variation (CV): 0.5707067523
Kurtosis: -1.186694207
Mean: 10350.0871
Median Absolute Deviation (MAD): 5071
Skewness: -0.01475611063
Sum: 183227592
Variance: 34891049.37
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values
Value                   Count   Frequency (%)
0                       1       < 0.1%
13692                   1       < 0.1%
7561                    1       < 0.1%
5512                    1       < 0.1%
19843                   1       < 0.1%
17794                   1       < 0.1%
9598                    1       < 0.1%
15741                   1       < 0.1%
3451                    1       < 0.1%
15725                   1       < 0.1%
Other values (17693)    17693   99.9%

Minimum 10 values
Value    Count   Frequency (%)
0        1       < 0.1%
1        1       < 0.1%
2        1       < 0.1%
3        1       < 0.1%
4        1       < 0.1%
5        1       < 0.1%
6        1       < 0.1%
7        1       < 0.1%
8        1       < 0.1%
9        1       < 0.1%

Maximum 10 values
Value    Count   Frequency (%)
20522    1       < 0.1%
20521    1       < 0.1%
20520    1       < 0.1%
20519    1       < 0.1%
20518    1       < 0.1%
20517    1       < 0.1%
20515    1       < 0.1%
20514    1       < 0.1%
20513    1       < 0.1%
20512    1       < 0.1%

invoice_date
DateTime

Distinct: 305
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 138.4 KiB
Minimum: 2016-11-29 00:00:00
Maximum: 2017-12-07 00:00:00
Histogram with fixed size bins (bins=50)

customer_id
Real number (ℝ≥0)

Distinct: 2970
Distinct (%): 16.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15228.20725
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12627
Q1: 13750
median: 15125
Q3: 16745
95-th percentile: 17949
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2995

Descriptive statistics

Standard deviation: 1731.917094
Coefficient of variation (CV): 0.1137308591
Kurtosis: -1.220108827
Mean: 15228.20725
Median Absolute Deviation (MAD): 1526
Skewness: 0.07732577092
Sum: 269584953
Variance: 2999536.821
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
14911                  144     0.8%
12748                  113     0.6%
17841                  113     0.6%
15311                  91      0.5%
14606                  88      0.5%
13089                  83      0.5%
12971                  71      0.4%
16422                  56      0.3%
14527                  55      0.3%
13798                  53      0.3%
Other values (2960)    16836   95.1%

Minimum 10 values
Value    Count   Frequency (%)
12347    7       < 0.1%
12348    4       < 0.1%
12352    7       < 0.1%
12356    3       < 0.1%
12358    2       < 0.1%
12359    6       < 0.1%
12360    3       < 0.1%
12362    13      0.1%
12364    4       < 0.1%
12370    4       < 0.1%

Maximum 10 values
Value    Count   Frequency (%)
18287    3       < 0.1%
18283    14      0.1%
18282    3       < 0.1%
18277    2       < 0.1%
18276    2       < 0.1%
18274    2       < 0.1%
18273    3       < 0.1%
18272    7       < 0.1%
18270    3       < 0.1%
18269    2       < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2964
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9161.586268
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 360.01
Q1: 1098.43
median: 2507.07
Q3: 5899.335
95-th percentile: 37153.85
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 4800.905

Descriptive statistics

Standard deviation: 25772.32458
Coefficient of variation (CV): 2.8130854
Kurtosis: 53.31618939
Mean: 9161.586268
Median Absolute Deviation (MAD): 1736.97
Skewness: 6.604744639
Sum: 162187561.7
Variance: 664212714
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
140450.72              144     0.8%
40967.72               113     0.6%
32317.32               113     0.6%
60767.9                91      0.5%
12021.65               88      0.5%
58825.83               83      0.5%
11189.91               71      0.4%
34684.4                56      0.3%
8507.82                55      0.3%
37153.85               53      0.3%
Other values (2954)    16836   95.1%

Minimum 10 values
Value    Count   Frequency (%)
6.2      2       < 0.1%
13.3     2       < 0.1%
15       2       < 0.1%
36.56    4       < 0.1%
45       2       < 0.1%
52       2       < 0.1%
52.2     2       < 0.1%
52.2     2       < 0.1%
62.43    2       < 0.1%
68.84    2       < 0.1%

Maximum 10 values
Value        Count   Frequency (%)
279138.02    46      0.3%
259657.3     26      0.1%
194550.79    29      0.2%
168472.5     2       < 0.1%
140450.72    144     0.8%
124564.53    16      0.1%
117379.63    51      0.3%
91062.38     33      0.2%
72882.09     38      0.2%
66653.56     17      0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.3940575
Minimum: 0
Maximum: 373
Zeros: 635
Zeros (%): 3.6%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
median: 16
Q3: 43
95-th percentile: 175
Maximum: 373
Range: 373
Interquartile range (IQR): 39

Descriptive statistics

Standard deviation: 59.21515027
Coefficient of variation (CV): 1.54229988
Kurtosis: 8.092698024
Mean: 38.3940575
Median Absolute Deviation (MAD): 14
Skewness: 2.710942182
Sum: 679690
Variance: 3506.434022
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
1                     1551    8.8%
2                     1091    6.2%
3                     908     5.1%
4                     771     4.4%
8                     681     3.8%
0                     635     3.6%
10                    529     3.0%
9                     528     3.0%
7                     501     2.8%
17                    434     2.5%
Other values (262)    10074   56.9%

Minimum 10 values
Value    Count   Frequency (%)
0        635     3.6%
1        1551    8.8%
2        1091    6.2%
3        908     5.1%
4        771     4.4%
5        296     1.7%
7        501     2.8%
8        681     3.8%
9        528     3.0%
10       529     3.0%

Maximum 10 values
Value    Count   Frequency (%)
373      4       < 0.1%
372      9       0.1%
371      2       < 0.1%
368      2       < 0.1%
366      9       0.1%
365      4       < 0.1%
364      2       < 0.1%
360      3       < 0.1%
359      2       < 0.1%
358      10      0.1%

invoice_no
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 56
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.28870813
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 7
Q3: 15
95-th percentile: 57
Maximum: 206
Range: 205
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 28.96384421
Coefficient of variation (CV): 1.778154779
Kurtosis: 23.88118339
Mean: 16.28870813
Median Absolute Deviation (MAD): 4
Skewness: 4.552737187
Sum: 288359
Variance: 838.9042714
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count   Frequency (%)
2                    1800    10.2%
4                    1719    9.7%
3                    1635    9.2%
5                    1271    7.2%
6                    1118    6.3%
7                    1002    5.7%
8                    833     4.7%
9                    676     3.8%
11                   615     3.5%
10                   571     3.2%
Other values (46)    6463    36.5%

Minimum 10 values
Value    Count   Frequency (%)
1        396     2.2%
2        1800    10.2%
3        1635    9.2%
4        1719    9.7%
5        1271    7.2%
6        1118    6.3%
7        1002    5.7%
8        833     4.7%
9        676     3.8%
10       571     3.2%

Maximum 10 values
Value    Count   Frequency (%)
206      113     0.6%
199      144     0.8%
124      113     0.6%
97       83      0.5%
91       179     1.0%
86       71      0.4%
72       46      0.3%
62       90      0.5%
60       26      0.1%
57       53      0.3%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 2970
Distinct (%): 16.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.02014368
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 5.226807859
Q1: 13.99977723
median: 19.12316667
Q3: 28.9025
95-th percentile: 114.5063732
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 14.90272277

Descriptive statistics

Standard deviation: 604.3794902
Coefficient of variation (CV): 13.72961194
Kurtosis: 8395.639277
Mean: 44.02014368
Median Absolute Deviation (MAD): 7.170833333
Skewness: 90.51563435
Sum: 779288.6036
Variance: 365274.5682
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
24.75775075            144     0.8%
7.056183406            113     0.6%
5.226807859            113     0.6%
25.54346364            91      0.5%
4.455763529            88      0.5%
32.35744224            83      0.5%
36.68822951            71      0.4%
93.99566396            56      0.3%
8.761915551            55      0.3%
106.4580229            53      0.3%
Other values (2960)    16836   95.1%

Minimum 10 values
Value          Count   Frequency (%)
2.150588235    4       < 0.1%
2.4325         2       < 0.1%
2.462371134    2       < 0.1%
2.511241379    3       < 0.1%
2.515333333    2       < 0.1%
2.65           2       < 0.1%
2.656931818    2       < 0.1%
2.707598253    3       < 0.1%
2.760621572    8       < 0.1%
2.770464191    14      0.1%

Maximum 10 values
Value          Count   Frequency (%)
56157.5        2       < 0.1%
4453.43        2       < 0.1%
3202.92        3       < 0.1%
1687.2         3       < 0.1%
952.9875       3       < 0.1%
872.13         3       < 0.1%
841.0214493    29      0.2%
651.1683333    4       < 0.1%
640            4       < 0.1%
624.4          2       < 0.1%

avg_recency_days
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1257
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.18228747
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.271428571
Q1: 17.33333333
median: 31
Q3: 54.6
95-th percentile: 123
Maximum: 366
Range: 365
Interquartile range (IQR): 37.26666667

Descriptive statistics

Standard deviation: 45.24071433
Coefficient of variation (CV): 1.023955909
Kurtosis: 12.63302866
Mean: 44.18228747
Median Absolute Deviation (MAD): 16.56
Skewness: 3.011411808
Sum: 782159.0351
Variance: 2046.722233
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
2.601398601            144     0.8%
3.330357143            113     0.6%
3.321428571            113     0.6%
35                     95      0.5%
4.144444444            91      0.5%
4.275862069            88      0.5%
4.475609756            83      0.5%
70                     81      0.5%
14.72                  78      0.4%
22.375                 77      0.4%
Other values (1247)    16740   94.6%

Minimum 10 values
Value          Count   Frequency (%)
1              32      0.2%
1.5            3       < 0.1%
2              26      0.1%
2.5            3       < 0.1%
2.601398601    144     0.8%
3              31      0.2%
3.321428571    113     0.6%
3.330357143    113     0.6%
3.5            6       < 0.1%
4              50      0.3%

Maximum 10 values
Value    Count   Frequency (%)
366      2       < 0.1%
365      2       < 0.1%
363      2       < 0.1%
362      2       < 0.1%
357      4       < 0.1%
356      2       < 0.1%
355      4       < 0.1%
352      2       < 0.1%
351      4       < 0.1%
350      6       < 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1350
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.07552646186
Minimum: 0.005449591281
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.01290322581
Q1: 0.02508960573
median: 0.04142011834
Q3: 0.07608695652
95-th percentile: 0.25
Maximum: 3
Range: 2.994550409
Interquartile range (IQR): 0.05099735079

Descriptive statistics

Standard deviation: 0.117333527
Coefficient of variation (CV): 1.553541952
Kurtosis: 76.64809037
Mean: 0.07552646186
Median Absolute Deviation (MAD): 0.02071005917
Skewness: 6.254189121
Sum: 1337.044954
Variance: 0.01376715655
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
0.6514745308           144     0.8%
0.5802139037           113     0.6%
0.4530831099           113     0.6%
0.3155080214           91      0.5%
0.3378016086           88      0.5%
0.3206521739           83      0.5%
0.2378378378           71      0.4%
0.02777777778          71      0.4%
0.03571428571          69      0.4%
0.06666666667          66      0.4%
Other values (1340)    16794   94.9%

Minimum 10 values
Value             Count   Frequency (%)
0.005449591281    2       < 0.1%
0.005464480874    2       < 0.1%
0.005494505495    2       < 0.1%
0.005509641873    2       < 0.1%
0.005586592179    4       < 0.1%
0.005602240896    2       < 0.1%
0.005617977528    4       < 0.1%
0.00566572238     2       < 0.1%
0.005681818182    4       < 0.1%
0.005698005698    6       < 0.1%

Maximum 10 values
Value           Count   Frequency (%)
3               2       < 0.1%
2               2       < 0.1%
1.571428571     3       < 0.1%
1.5             6       < 0.1%
1               28      0.2%
0.8333333333    3       < 0.1%
0.75            3       < 0.1%
0.6666666667    24      0.1%
0.6514745308    144     0.8%
0.6             2       < 0.1%

qtde_returns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 215
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 142.6579676
Minimum: 0
Maximum: 80995
Zeros: 5482
Zeros (%): 31.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 6
Q3: 36
95-th percentile: 446
Maximum: 80995
Range: 80995
Interquartile range (IQR): 36

Descriptive statistics

Standard deviation: 1062.494385
Coefficient of variation (CV): 7.447844678
Kurtosis: 3799.981426
Mean: 142.6579676
Median Absolute Deviation (MAD): 6
Skewness: 51.67714053
Sum: 2525474
Variance: 1128894.317
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
0                     5482    31.0%
1                     774     4.4%
2                     727     4.1%
4                     581     3.3%
3                     537     3.0%
5                     456     2.6%
6                     437     2.5%
12                    386     2.2%
8                     359     2.0%
18                    311     1.8%
Other values (205)    7653    43.2%

Minimum 10 values
Value    Count   Frequency (%)
0        5482    31.0%
1        774     4.4%
2        727     4.1%
3        537     3.0%
4        581     3.3%
5        456     2.6%
6        437     2.5%
7        261     1.5%
8        359     2.0%
9        303     1.7%

Maximum 10 values
Value    Count   Frequency (%)
80995    2       < 0.1%
9360     16      0.1%
9014     2       < 0.1%
8004     38      0.2%
4427     14      0.1%
3768     6       < 0.1%
3332     144     0.8%
2878     29      0.2%
2022     9       0.1%
2012     24      0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1980
Distinct (%): 11.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 270.1811978
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 52.5
Q1: 116.5833333
median: 188.5
Q3: 299.5
95-th percentile: 660.8627451
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 182.9166667

Descriptive statistics

Standard deviation: 533.1357215
Coefficient of variation (CV): 1.973252491
Kurtosis: 3667.449042
Mean: 270.1811978
Median Absolute Deviation (MAD): 84.5
Skewness: 49.74912676
Sum: 4783017.745
Variance: 284233.6976
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                  Count   Frequency (%)
403.3316583            144     0.8%
123.8398058            113     0.6%
185.9112903            113     0.6%
419.7142857            91      0.5%
68.2967033             88      0.5%
320.3092784            83      0.5%
108.0116279            71      0.4%
660.8627451            56      0.3%
37.94545455            55      0.3%
420.1403509            53      0.3%
Other values (1970)    16836   95.1%

Minimum 10 values
Value          Count   Frequency (%)
1              4       < 0.1%
2              2       < 0.1%
3.333333333    6       < 0.1%
5.333333333    4       < 0.1%
5.666666667    3       < 0.1%
6.142857143    5       < 0.1%
7.5            4       < 0.1%
9              2       < 0.1%
9.5            2       < 0.1%
11             3       < 0.1%

Maximum 10 values
Value          Count   Frequency (%)
40498.5        2       < 0.1%
6009.333333    2       < 0.1%
4282           2       < 0.1%
3906           3       < 0.1%
3868.65        16      0.1%
2880           6       < 0.1%
2801           2       < 0.1%
2733.944444    46      0.3%
2518.769231    13      0.1%
2160.333333    3       < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1005
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.50874042
Minimum: 1
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 138.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.75862069
Q1: 10.24242424
median: 17.25
Q3: 26.6
95-th percentile: 53.75
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 16.35757576

Descriptive statistics

Standard deviation: 19.06981243
Coefficient of variation (CV): 0.8866075864
Kurtosis: 51.04282444
Mean: 21.50874042
Median Absolute Deviation (MAD): 7.75
Skewness: 4.806136608
Sum: 380769.2316
Variance: 363.657746
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
13                    200     1.1%
11                    146     0.8%
28.50753769           144     0.8%
16                    138     0.8%
14                    114     0.6%
63.20967742           113     0.6%
22.23300971           113     0.6%
9                     112     0.6%
18                    107     0.6%
17                    102     0.6%
Other values (995)    16414   92.7%

Minimum 10 values
Value          Count   Frequency (%)
1              97      0.5%
1.2            5       < 0.1%
1.25           5       < 0.1%
1.333333333    7       < 0.1%
1.5            22      0.1%
1.568181818    29      0.2%
1.571428571    4       < 0.1%
1.666666667    16      0.1%
1.833333333    5       < 0.1%
2              83      0.5%

Maximum 10 values
Value          Count   Frequency (%)
299.7058824    17      0.1%
259            2       < 0.1%
203.5          4       < 0.1%
148            2       < 0.1%
145            3       < 0.1%
136.125        10      0.1%
135.5          4       < 0.1%
127            2       < 0.1%
122            4       < 0.1%
118            3       < 0.1%

Interactions

[121 interaction plots, one per ordered pair of the 11 numeric variables; images not preserved in this text version]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
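
As a minimal sketch of the calculation just described, the pure-Python function below divides the covariance by the product of the standard deviations (the 1/n factors cancel, so sums of squared deviations are used directly). The function name and toy data are hypothetical and are not taken from the profiled dataset.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross products and squared deviations; the 1/n factors cancel out.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746

# Invariance under a change of location and scale of one variable.
x_rescaled = [10 * a + 3 for a in x]
print(abs(pearson_r(x_rescaled, y) - r) < 1e-12)  # True
```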

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
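
The rank-based calculation can be sketched as follows: convert each variable to ranks, then apply the Pearson formula to the ranks. Giving tied values their average rank is a common convention and is an assumption here; the helper names and toy data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear
print(round(spearman_rho(x, y), 6))  # 1.0
print(pearson_r(x, y) < 1.0)         # True
```

The cubic toy relationship is perfectly monotonic, so ρ reaches 1 while Pearson's r stays below 1.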

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
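
A minimal sketch of the pair-counting definition above (often called tau-a) is shown below. Library implementations commonly use a tie-corrected variant instead, so values may differ when the data contain ties; the function name and toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
print(round(kendall_tau_a(x, y), 4))  # 0.6667
```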

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    df_index  invoice_date  customer_id  gross_revenue  recency_days  invoice_no  avg_ticket  avg_recency_days  frequency  qtde_returns  avg_basket_size  avg_unique_basket_size
0   0         2016-11-29    17850        5391.21        372.0         34.0        18.152222   35.500000         0.486111   40.0          50.970588        8.735294
1   1         2016-11-29    13047        3232.59        56.0          9.0         18.904035   27.250000         0.048780   35.0          154.444444       19.000000
2   2         2016-11-29    12583        6705.38        2.0           15.0        28.902500   23.187500         0.045699   50.0          335.200000       15.466667
3   3         2016-11-29    13748        948.25         95.0          5.0         33.866071   92.666667         0.017921   0.0           87.800000        5.600000
4   4         2016-11-29    15100        876.00         333.0         3.0         292.000000  8.600000          0.136364   22.0          26.666667        1.000000
5   5         2016-11-29    15291        4623.30        25.0          14.0        45.326471   23.200000         0.054441   29.0          150.142857       7.285714
6   6         2016-11-29    14688        5630.87        7.0           21.0        17.219786   18.300000         0.073569   399.0         172.428571       15.571429
7   7         2016-11-29    17809        5411.91        16.0          12.0        88.719836   35.700000         0.039106   41.0          171.416667       5.083333
8   8         2016-11-29    15311        60767.90       0.0           91.0        25.543464   4.144444          0.315508   474.0         419.714286       26.142857
9   9         2016-11-29    16098        2005.63        87.0          7.0         29.934776   47.666667         0.024390   0.0           87.571429        9.571429

Last rows

       df_index  invoice_date  customer_id  gross_revenue  recency_days  invoice_no  avg_ticket  avg_recency_days  frequency  qtde_returns  avg_basket_size  avg_unique_basket_size
17693  20512     2017-12-07    17315        6252.18        1.0           36.0        13.302511   10.200000         0.114525   32.0          107.277778       13.055556
17694  20513     2017-12-07    12662        3543.78        0.0           11.0        16.108091   37.300000         0.032086   18.0          185.545455       20.000000
17695  20514     2017-12-07    16705        14034.99       0.0           20.0        51.981444   17.900000         0.080780   18.0          273.800000       13.500000
17696  20515     2017-12-07    12526        1172.66        0.0           3.0         17.245000   47.000000         0.031579   0.0           208.000000       22.666667
17697  20517     2017-12-07    17581        11045.04       0.0           25.0        25.102364   24.800000         0.083110   66.0          237.080000       17.600000
17698  20518     2017-12-07    12748        32317.32       0.0           206.0       7.056183    3.330357          0.580214   1535.0        123.839806       22.233010
17699  20519     2017-12-07    13777        25977.16       0.0           33.0        131.863756  12.433333         0.106952   90.0          390.818182       5.969697
17700  20520     2017-12-07    15804        4206.39        0.0           13.0        16.054924   11.647059         0.095477   52.0          197.307692       20.153846
17701  20521     2017-12-07    13113        12245.96       0.0           24.0        61.229800   15.913043         0.106267   449.0         126.875000       8.333333
17702  20522     2017-12-07    12680        790.81         0.0           4.0         16.138980   37.666667         0.035088   0.0           109.750000       12.250000